Abstract

We provide a simple and efficient algorithm for the projection operator for weighted $\ell_1$-norm
regularization subject to a sum constraint, together with an elementary proof. The implementation of the
proposed algorithm can be downloaded from the author's homepage.


1 The problem 

In this report, we consider the following optimization problem: 


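\begin{align}
\min_{\mathbf{x}} \quad & \frac{1}{2} \|\mathbf{x} - \mathbf{y}\|^2 + \sum_{i=1}^{n} d_i |x_i| \tag{1} \\
\text{s.t.} \quad & \mathbf{x}^T \mathbf{1} = 1, \notag
\end{align}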

where $\mathbf{y} = [y_1, \dots, y_n]^T \in \mathbb{R}^n$, $d_i > 0$, $i = 1, \dots, n$, and $\mathbf{1}$ is the $n$-dimensional vector consisting of all 1's.
This is a quadratic program and the objective function is strictly convex (even though it is non-smooth), so
there is a unique solution, which we denote by $\mathbf{x} = [x_1, \dots, x_n]^T$ with a slight abuse of notation.

Notice that if $d_1 = \dots = d_n$ and the constraint were absent, the problem would have a closed-form solution known
as the soft-shrinkage operator (see, e.g., Beck and Teboulle, 2009), which is widely used for solving
$\ell_1$-regularized problems in learning sparse representations. Our problem (1) is more involved due to the
constraint, which couples all dimensions of $\mathbf{x}$. Nonetheless, we give an efficient algorithm with time complexity
$O(n \log n)$ for this problem using only the KKT theorem.
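
For reference, the unconstrained soft-shrinkage operator mentioned above can be sketched in a few lines of Python with NumPy. Without the sum constraint each coordinate decouples, so the minimizer is obtained by shrinking every $y_i$ toward zero by its threshold. The name `soft_shrink` is our own choice for illustration and is not taken from the author's implementation.

```python
import numpy as np

def soft_shrink(y, d):
    """Minimize 0.5*||x - y||^2 + sum_i d_i*|x_i| with no constraint on x.

    `d` may be a scalar or a per-coordinate array of non-negative thresholds.
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    # Shrink the magnitude by d and clip at zero; the sign of y is preserved.
    return np.sign(y) * np.maximum(np.abs(y) - d, 0.0)
```

With the sum constraint present, this coordinate-wise formula no longer applies directly, because the multiplier of the constraint shifts every coordinate by the same amount.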

Remark 1.1. Our motivation for (1) also comes from sparse coding. Yu et al. (2009) propose the local
coordinate coding (LCC) algorithm for learning sparse representations induced by locality. Given a data
sample $\mathbf{u} \in \mathbb{R}^m$ and a set of landmark points $\{\mathbf{v}_j\}_{j=1}^{C}$ where $\mathbf{v}_j \in \mathbb{R}^m$, $j = 1, \dots, C$, the LCC algorithm
reconstructs $\mathbf{u}$ from the landmark points while forcing faraway landmark points to contribute less than
nearby landmark points (that is, to have smaller reconstruction coefficients). Let the reconstruction coefficient of
$\mathbf{v}_j$ be $w_j$, $j = 1, \dots, C$. Then the optimization problem for these coefficients in LCC is


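\begin{align}
\min_{\mathbf{w}} \quad & \frac{1}{2} \Big\| \mathbf{u} - \sum_{j=1}^{C} w_j \mathbf{v}_j \Big\|^2 + \lambda \sum_{j=1}^{C} \|\mathbf{u} - \mathbf{v}_j\|^2 \, |w_j| \tag{2} \\
\text{s.t.} \quad & \sum_{j=1}^{C} w_j = 1, \notag
\end{align}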


where $\lambda > 0$ is some trade-off parameter. The constraint in (2) ensures that the representation is translation
invariant. There are different ways of solving this problem; e.g., Elhamifar and Vidal (2011) have a similar
optimization problem, which they solve with the Alternating Direction Method of Multipliers (Boyd et al.,
2011).

One simple way of solving (2) is to use the proximal gradient algorithm and its Nesterov acceleration scheme
(see Beck and Teboulle, 2009 and the references therein), where one iteratively takes a short gradient step for
the smooth quadratic term and then projects the new estimate with the weighted $\ell_1$ regularization term subject
to the sum constraint. This projection operator solves exactly (1).


2 The solution 


We solve the problem (1) using only the KKT theorem (Nocedal and Wright, 2006), which states the
necessary and sufficient conditions¹ satisfied by the solution $\mathbf{x}$. The Lagrangian of (1) is


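\begin{equation}
\mathcal{L}(\mathbf{x}, \alpha) = \frac{1}{2} \|\mathbf{x} - \mathbf{y}\|^2 + \sum_{i=1}^{n} d_i |x_i| + \alpha (\mathbf{x}^T \mathbf{1} - 1), \tag{3}
\end{equation}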


where $\alpha$ is the Lagrange multiplier associated with the constraint. The KKT system of this problem is


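\begin{align}
x_i - y_i + d_i + \alpha &= 0, & \text{if } x_i &> 0, \tag{4a} \\
x_i - y_i - d_i + \alpha &= 0, & \text{if } x_i &< 0, \tag{4b} \\
-d_i \le -y_i + \alpha &\le d_i, & \text{if } x_i &= 0, \tag{4c} \\
\sum_{i=1}^{n} x_i &= 1, \tag{4d}
\end{align}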

where we have used the fact that the sub-differential of $|x|$ is $[-1, 1]$ at $x = 0$ to obtain (4c).
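
The KKT system (4a)-(4d) is easy to verify numerically for a candidate pair $(\mathbf{x}, \alpha)$. The following Python sketch is a hypothetical helper of our own (the name `kkt_satisfied` and the tolerance `tol` are assumptions for illustration, not part of the author's code); it treats coordinates with magnitude below `tol` as zero.

```python
import numpy as np

def kkt_satisfied(x, alpha, y, d, tol=1e-9):
    """Check conditions (4a)-(4d) for a candidate solution x and multiplier alpha."""
    x, y, d = (np.asarray(a, dtype=float) for a in (x, y, d))
    r = x - y + alpha                      # smooth part of the stationarity condition
    pos, neg = x > tol, x < -tol
    zero = ~(pos | neg)
    ok_pos = np.all(np.abs(r[pos] + d[pos]) <= tol)     # (4a)
    ok_neg = np.all(np.abs(r[neg] - d[neg]) <= tol)     # (4b)
    ok_zero = np.all(np.abs(r[zero]) <= d[zero] + tol)  # (4c)
    ok_sum = abs(x.sum() - 1.0) <= tol                  # (4d)
    return bool(ok_pos and ok_neg and ok_zero and ok_sum)
```

Because the problem is strictly convex, any pair that passes this check is the unique solution up to the chosen tolerance.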

Denote $y_i^- = y_i - d_i$ and $y_i^+ = y_i + d_i$, $i = 1, \dots, n$, which can be computed beforehand. We can then rewrite
(4) in terms of $\alpha$:

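\begin{align}
\alpha < y_i^- \quad &\Rightarrow \quad x_i > 0, \tag{5a} \\
\alpha > y_i^+ \quad &\Rightarrow \quad x_i < 0, \tag{5b} \\
y_i^- \le \alpha \le y_i^+ \quad &\Rightarrow \quad x_i = 0, \tag{5c} \\
\sum_{i:\, x_i > 0} (y_i^- - \alpha) + \sum_{i:\, x_i < 0} (y_i^+ - \alpha) &= 1. \tag{5d}
\end{align}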


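Obviously, the Lagrange multiplier $\alpha$ is the key to our problem. Once the value of $\alpha$ is determined, we
can easily obtain the optimal solution by setting

\begin{align}
x_i &= y_i^- - \alpha && \text{if } y_i^- > \alpha, \tag{6a} \\
x_i &= y_i^+ - \alpha && \text{if } y_i^+ < \alpha, \tag{6b} \\
x_i &= 0 && \text{otherwise.} \tag{6c}
\end{align}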
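
The recovery rule (6a)-(6c) can be written as a single vectorized expression. The Python sketch below is for illustration only; the function name `x_from_alpha` is our own. Since $d_i > 0$ implies $y_i^- < y_i^+$, at most one of the two clipped terms is nonzero for each coordinate.

```python
import numpy as np

def x_from_alpha(alpha, y, d):
    """Recover x from the multiplier alpha using (6a)-(6c)."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    y_minus, y_plus = y - d, y + d
    positive_part = np.maximum(y_minus - alpha, 0.0)  # (6a): active when y_i^- > alpha
    negative_part = np.minimum(y_plus - alpha, 0.0)   # (6b): active when y_i^+ < alpha
    return positive_part + negative_part              # (6c): zero otherwise
```

For example, with `y = [0.5, 0.8, -0.3]`, `d = [0.1, 0.2, 0.1]`, and `alpha = -0.2 / 3`, the first two coordinates are positive, the third is negative, and the coordinates sum to 1 as required by the constraint.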

We can sort all dimensions of $\mathbf{y}^-$ and $\mathbf{y}^+$ together (a total of $2n$ scalars) into an ascending $z$-sequence:

\begin{equation}
z_1 \le z_2 \le \dots \le z_{2n}. \tag{7}
\end{equation}


An important observation is that the $z$-sequence partitions the real axis into $4n + 1$ disjoint sets, each
being either a single point set $\{z_j\}$, $j = 1, \dots, 2n$, or an open interval of the form $(-\infty, z_1)$, $(z_j, z_{j+1})$,
$j = 1, \dots, 2n - 1$, or $(z_{2n}, \infty)$, and the Lagrange multiplier $\alpha$ for the solution must lie in one of them.

We then test each of the $4n + 1$ sets as follows. Assuming that $\alpha$ lies in one set, we can use (5a)-(5c)
to conjecture the positive, negative, and zero dimensions of a possible solution $\mathbf{x}$. After that, we use (5d) to
compute a hypothesized value $\alpha$ for the Lagrange multiplier, i.e.,

\begin{equation}
\alpha = \frac{\sum_{i:\, x_i > 0} y_i^- + \sum_{i:\, x_i < 0} y_i^+ - 1}{\sum_{i:\, x_i > 0} 1 + \sum_{i:\, x_i < 0} 1}. \tag{8}
\end{equation}

¹ Strictly speaking, our objective is convex and non-smooth, so the condition is that the zero vector $\mathbf{0}$ lies in the sub-differential at the solution $\mathbf{x}$.

If the computed $\alpha$ indeed lies in the assumed set (a point or an open interval), we have a KKT point and
thus the solution.

Since the problem (1) is strictly convex and there exists a unique global optimum, this procedure will find
the exact solution with no more than $4n + 1$ tests. We can do this efficiently by sorting $\mathbf{y}^-$ and $\mathbf{y}^+$ separately
($O(n \log n)$ operations) and gradually merging the two sorted sequences (an $O(n)$ operation). Therefore the
total cost of our procedure for solving (1) is of order $O(n \log n)$.
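
The search described above can be sketched in Python as follows. This is a simplified variant written for illustration, not the author's MATLAB/C++ code: instead of merging two separately sorted sequences and testing points and open intervals one at a time, it sorts all $2n$ breakpoints at once and stops at the first breakpoint where the piecewise-linear, non-increasing function $\sum_i x_i(\alpha)$ has dropped to 1 or below. The names `solve_alpha` and `project` are our own.

```python
import numpy as np

def solve_alpha(y, d):
    """Return the multiplier alpha for which the x given by (6a)-(6c) sums to 1."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    n = y.size
    y_minus, y_plus = y - d, y + d
    # All 2n breakpoints. Passing a y_i^- switches a positive coordinate off
    # (step -1); passing a y_i^+ switches a negative coordinate on (step +1).
    z = np.concatenate([y_minus, y_plus])
    step = np.concatenate([-np.ones(n, dtype=int), np.ones(n, dtype=int)])
    # State for alpha below every breakpoint: all coordinates are positive.
    s, t = y_minus.sum(), n
    for k in np.argsort(z, kind="stable"):
        # On the current interval, sum_i x_i(alpha) equals s - t * alpha.
        if s - t * z[k] <= 1.0:
            break  # the sum reaches 1 at or before z[k], so alpha lies here
        s += step[k] * z[k]
        t += step[k]
    return (s - 1.0) / t  # equation (8) with the current active set

def project(y, d):
    """Solve problem (1): return the minimizer x and the multiplier alpha."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    alpha = solve_alpha(y, d)
    x = np.maximum(y - d - alpha, 0.0) + np.minimum(y + d - alpha, 0.0)
    return x, alpha
```

The sort dominates the cost, so this sketch also runs in $O(n \log n)$ time. Tied breakpoints need no special handling here, because a coordinate whose breakpoint equals $\alpha$ contributes zero to the sum whether or not it is counted as active.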

Algorithm 1 gives the detailed pseudocode for solving (1), whose MATLAB and C++ implementations can
be downloaded at https://eng.ucmerced.edu/people/wwEuig5.

Algorithm 1 Pseudo-code of our projection operator for (1).

Input: $\mathbf{y} \in \mathbb{R}^n$ and $\mathbf{d} = [d_1, \dots, d_n]$ where $d_i > 0$, $i = 1, \dots, n$.

Sort $\mathbf{y} - \mathbf{d}$ into $\mathbf{y}^-$: $y_1^- \le y_2^- \le \dots \le y_n^-$. And sort $\mathbf{y} + \mathbf{d}$ into $\mathbf{y}^+$: $y_1^+ \le y_2^+ \le \dots \le y_n^+$.
$i \leftarrow 1$, $j \leftarrow 1$    % i/j: index of the dimension of $\mathbf{y}^-$/$\mathbf{y}^+$ that will be merged next.
% $s_1$/$s_2$ stores the sum of dimensions of $\mathbf{y}^-$/$\mathbf{y}^+$ that are strictly greater/smaller than the current estimate of $\alpha$.
$s_1 \leftarrow \sum_{i=1}^{n} y_i^-$, $s_2 \leftarrow 0$, $t \leftarrow n$    % t is the number of nonzero dimensions of the hypothesized $\mathbf{x}$.
if $(s_1 + s_2 - 1) < t \cdot y_1^-$ then
    $\alpha \leftarrow (s_1 + s_2 - 1)/t$; return    % $\alpha < y_1^-$, all dimensions of $\mathbf{x}$ are positive.
end if
while true do
    % Test a single point set.
    if $y_i^- < y_j^+$ then
        $k \leftarrow i$    % $y_i^-$ is the next value in the z-sequence.
        while ($y_k^- = y_i^-$) && ($k < n$) do
            $s_1 \leftarrow s_1 - y_k^-$, $t \leftarrow t - 1$, $k \leftarrow k + 1$    % Skip the contiguous block of identical dimensions in $\mathbf{y}^-$.
        end while
        if $(s_1 + s_2 - 1) = t \cdot y_i^-$ then
            $\alpha \leftarrow y_i^-$; return    % $\alpha$ happens to lie in a single point set.
        else
            left $\leftarrow y_i^-$, $i \leftarrow k$    % Otherwise, $\alpha$ lies in an open interval with left boundary left.
        end if
    else
        if $y_i^- > y_j^+$ then
            % $y_j^+$ is the next value in the z-sequence.
            if $(s_1 + s_2 - 1) = t \cdot y_j^+$ then
                $\alpha \leftarrow y_j^+$; return    % $\alpha$ happens to lie in a single point set.
            else
                left $\leftarrow y_j^+$    % Otherwise, $\alpha$ lies in an open interval with left boundary left.
                while ($y_j^+$ = left) && ($j < n$) do
                    $s_2 \leftarrow s_2 + y_j^+$, $t \leftarrow t + 1$, $j \leftarrow j + 1$    % Skip the contiguous block of identical entries in $\mathbf{y}^+$.
                end while
            end if
        else
            $k \leftarrow i$    % $y_i^- = y_j^+$ is the next value in the z-sequence.
            while ($y_k^- = y_i^-$) && ($k < n$) do
                $s_1 \leftarrow s_1 - y_k^-$, $t \leftarrow t - 1$, $k \leftarrow k + 1$
            end while
            if $(s_1 + s_2 - 1) = t \cdot y_i^-$ then
                $\alpha \leftarrow y_i^-$; return
            else
                left $\leftarrow y_i^-$, $i \leftarrow k$
                while ($y_j^+$ = left) && ($j < n$) do
                    $s_2 \leftarrow s_2 + y_j^+$, $t \leftarrow t + 1$, $j \leftarrow j + 1$
                end while
            end if
        end if
    end if
    % Find the right boundary of the open interval and test if it contains $\alpha$.
    if $y_i^- < y_j^+$ then
        right $\leftarrow y_i^-$
    else
        right $\leftarrow y_j^+$
    end if
    if $t \cdot$ left $< (s_1 + s_2 - 1)$ && $t \cdot$ right $> (s_1 + s_2 - 1)$ then
        $\alpha \leftarrow (s_1 + s_2 - 1)/t$; return    % $\alpha$ lies in the open interval (left, right).
    end if
end while

Output: $\alpha$ is the Lagrange multiplier of the problem (1); use (6a)-(6c) to obtain $\mathbf{x}$.